Distributed Vertex-Cut Partitioning
Authors
Abstract
Graph processing has become an integral part of big data analytics. With the ever-increasing size of graphs, one needs to partition them into smaller clusters, which can be managed and processed more easily on multiple machines in a distributed fashion. While there exist numerous solutions for edge-cut partitioning of graphs, very little effort has been made for vertex-cut partitioning. This is in spite of the fact that vertex-cuts have proved significantly more effective than edge-cuts for processing most real-world graphs. In this paper we present Ja-be-Ja-vc, a parallel and distributed algorithm for vertex-cut partitioning of large graphs. In a nutshell, Ja-be-Ja-vc is a local search algorithm that iteratively improves upon an initial random assignment of edges to partitions. We propose several heuristics for this optimization and study their impact on the final partitioning. Moreover, we employ a simulated annealing technique to escape local optima. We evaluate our solution on various graphs and with a variety of settings, and compare it against two state-of-the-art solutions. We show that Ja-be-Ja-vc outperforms the existing solutions in that it not only creates partitions of any requested size, but also achieves a vertex-cut that is better than its counterparts and more than 70% better than random partitioning.
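The general idea in the abstract (start from a random assignment of edges to partitions, improve it by local search, and use simulated annealing to escape local optima) can be illustrated with a small centralized sketch. This is not the Ja-be-Ja-vc algorithm itself, which is parallel and distributed and uses heuristics the abstract does not spell out. The function names, the cost definition (number of extra vertex replicas), the linear cooling schedule, and the parameters `steps`, `t0`, and `seed` are all assumptions made for illustration. Swapping the partitions of two edges, rather than moving a single edge, keeps every partition at its initial size.

```python
import math
import random
from collections import defaultdict


def vertex_cut_cost(edges, assignment):
    """Count extra vertex replicas: for each vertex, (partitions touching it) - 1."""
    parts = defaultdict(set)
    for (u, v), p in zip(edges, assignment):
        parts[u].add(p)
        parts[v].add(p)
    return sum(len(s) - 1 for s in parts.values())


def anneal_swaps(edges, k, steps=2000, t0=2.0, seed=0):
    """Hypothetical helper: swap-based local search with simulated annealing."""
    rng = random.Random(seed)
    # Balanced random start: partition ids 0..k-1 repeated, then shuffled.
    assignment = [i % k for i in range(len(edges))]
    rng.shuffle(assignment)
    cost = vertex_cut_cost(edges, assignment)
    best, best_cost = assignment[:], cost
    for step in range(steps):
        # Assumed linear cooling; the floor avoids division by zero.
        temp = max(t0 * (1 - step / steps), 1e-9)
        i, j = rng.sample(range(len(edges)), 2)
        if assignment[i] == assignment[j]:
            continue  # swapping equal partitions changes nothing
        assignment[i], assignment[j] = assignment[j], assignment[i]
        new_cost = vertex_cut_cost(edges, assignment)
        delta = new_cost - cost
        # Always accept improvements; accept worse states with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = assignment[:], cost
        else:
            # Reject: undo the swap.
            assignment[i], assignment[j] = assignment[j], assignment[i]
    return best, best_cost


if __name__ == "__main__":
    # Two disjoint triangles; a good 2-way vertex-cut keeps each triangle together.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
    best, best_cost = anneal_swaps(edges, k=2)
    print(best, best_cost)
```

Recomputing the full cost after every swap keeps the sketch short; a practical implementation would update only the vertices incident to the two swapped edges.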
Similar resources
Multiply Balanced k-Partitioning
The problem of partitioning an edge-capacitated graph on n vertices into k balanced parts has been amply researched. Motivated by applications such as load balancing in distributed systems and market segmentation in social networks, we propose a new variant of the problem, called Multiply Balanced k-Partitioning, where the vertex-partition must be balanced under d vertex-weight functions simult...
Distributed Power-law Graph Computing: Theoretical and Empirical Analysis
Typically, a large-scale natural graph follows a skewed power law. In distributed graph-structured computations, this skewness usually results in poor partitioning, which leads to high communication cost and workload imbalance. Therefore, graph partitioning (GP) is a challenging issue. To tackle this challenge, we introduce degree-based techniques into GP via vertex-cut. Accordingly, we develop a nov...
SBV-Cut: Vertex-cut based graph partitioning using structural balance vertices
Article history: received 6 January 2011; received in revised form 8 November 2011; accepted 8 November 2011; available online 21 November 2011. Graphs are used for modeling a large spectrum of data from the web, to social connections between individuals, to concept maps and ontologies. As the number and complexities of graph-based applications increase, rendering these graphs more compact, easier ...
ADWISE: Adaptive Window-based Streaming Edge Partitioning for High-Speed Graph Processing
In recent years, the graph partitioning problem has gained importance as a mandatory preprocessing step for distributed graph processing on very large graphs. Existing graph partitioning algorithms minimize partitioning latency by assigning individual graph edges to partitions in a streaming manner, at the cost of reduced partitioning quality. However, we argue that the mere minimization of partit...
Adaptive Partitioning of Large-Scale Dynamic Graphs
In recent years, large-scale graph processing has gained increasing attention, with most recent systems placing particular emphasis on latency. One possible technique to improve runtime performance in a distributed graph processing system is to reduce network communication. The most notable way to achieve this goal is to partition the graph by minimizing the number of edges that connect verti...